Setup
First, install some new packages.
if(!require(tmap)){install.packages("tmap")}
if(!require(sf)){install.packages("sf")}
if(!require(maps)){install.packages("maps")}
if(!require(mapsf)){install.packages("mapsf")}
Then, activate the ones we need.
library(tidyverse)
library(readxl)
library(tmap) # Geographical plotting
library(gapminder) # Country level data
library(maps) # Geographical datasets
library(sf) # Shape files (sf) are a common geographical format
library(plotly) # Dynamic graphs
library(mapsf) # A cool new package!
Now, load the data (included in the packages). The World dataset contains geographical data.
data("World")
str(World) # structure of World
Classes ‘sf’ and 'data.frame': 177 obs. of 16 variables:
$ iso_a3 : Factor w/ 177 levels "AFG","AGO","ALB",..: 1 2 3 4 5 6 7 8 9 10 ...
$ name : Factor w/ 177 levels "Afghanistan",..: 1 4 2 166 6 7 5 56 8 9 ...
$ sovereignt : Factor w/ 171 levels "Afghanistan",..: 1 4 2 159 6 7 5 52 8 9 ...
$ continent : Factor w/ 8 levels "Africa","Antarctica",..: 3 1 4 3 8 3 2 7 6 4 ...
$ area : Units: [km^2] num 652860 1246700 27400 71252 2736690 ...
$ pop_est : num 28400000 12799293 3639453 4798491 40913584 ...
$ pop_est_dens: num 43.5 10.3 132.8 67.3 15 ...
$ economy : Factor w/ 7 levels "1. Developed region: G7",..: 7 7 6 6 5 6 6 6 2 2 ...
$ income_grp : Factor w/ 5 levels "1. High income: OECD",..: 5 3 4 2 3 4 2 2 1 1 ...
$ gdp_cap_est : num 784 8618 5993 38408 14027 ...
$ life_exp : num 59.7 NA 77.3 NA 75.9 ...
$ well_being : num 3.8 NA 5.5 NA 6.5 4.3 NA NA 7.2 7.4 ...
$ footprint : num 0.79 NA 2.21 NA 3.14 2.23 NA NA 9.31 6.06 ...
$ inequality : num 0.427 NA 0.165 NA 0.164 ...
$ HPI : num 20.2 NA 36.8 NA 35.2 ...
$ geometry :sfc_MULTIPOLYGON of length 177; first list element: List of 1
..$ :List of 1
.. ..$ : num [1:69, 1:2] 5310471 5408503 5470647 5477147 5541607 ...
..- attr(*, "class")= chr [1:3] "XY" "MULTIPOLYGON" "sfg"
- attr(*, "sf_column")= chr "geometry"
- attr(*, "agr")= Factor w/ 3 levels "constant","aggregate",..: NA NA NA NA NA NA NA NA NA NA ...
..- attr(*, "names")= chr [1:15] "iso_a3" "name" "sovereignt" "continent" ...
data("gapminder")
head(gapminder)
The world data is very rich: it has many attributes (population, economy, etc.). The one column we are interested in is geometry: it contains the data for the maps!
NOTE: there are 2 very useful packages that allow to retrieve data from the World Bank: WID and wbstats: have a look!
We see that a few points are missing for life expectancy, so let’s see if we can complete with the gapminder data (spoiler: not so much!).
This exercise (see below) of grouping two datasets is paramount when plotting geographical content because usually, the geometric properties come from one file and the items we wish to plot come from another file. Thus, the joining step is KEY.
Areas
Maps in which areas are coloured to show some numerical attribute are called choropleths.
With the tmap package
The idea below is to merge the two sources to combine geographical forms and life expectancy. To do that, we need left_join(). But before, we need to create a common variable (i.e., a key) between the two. In World, the term “name” is not so great so let’s change it into “country” (just like in gapminder!). Then we can merge!
names(World)[2] <- "country" # Change the column name
gapminder %>%
left_join(World, by = "country") %>% # Join the two datasets
select(country, lifeExp, pop, geometry)
nrow(gapminder) # Number of instances
[1] 1704
There are missing points. As a rough approximation, let’s just remove them and store the result in data (more on them later).
data <- gapminder %>% # Join and save via the arrow operator <-
left_join(World, by = "country") %>%
na.omit() # Remove rows with missing data
nrow(data) # Number of instances in dataset
[1] 1296
The loss is large because of missing points! We then use the tmap package to create a choropleth.

Some countries are missing => data problem because gapminder is incomplete. Let’s complete it via: https://www.gapminder.org/data/ This yields a much bigger dataset, called “gap2.RData”. Let’s see:
load("gap2.RData")
head(gap2)
It’s highly tidy! Let’s un-tidy it a bit to put it in the original gapminder format.
gap2 <- gap2 %>% pivot_wider(names_from = "indicator", values_from = "value")
Again, we resort to left_join to merge the two data sources.
data <- World %>%
left_join(gap2, by = "country") #%>%
#na.omit()
With the mapsf package
References:
https://riatelab.github.io/mapsf/articles/mapsf.html
https://github.com/riatelab/mapsf
It’s super easy: you just need a shape file with the attribute you want to plot!
data %>%
filter(year == 2018) %>% # Keep only one year (problems otherwise)
sf::st_sf() %>%
mf_map(x = ., var = "lifeExp", type = "choro")

The package does not really use pipes.
map <- data %>%
filter(year == 2018) %>% # Keep only one year (problems otherwise)
sf::st_sf()
my_theme <- list(
name = "mytheme",
bg = "#DDEEFF", # Light blue
fg = "#444444", # Dark grey
mar = c(0, 0, 0, 0),
tab = TRUE,
pos = "center",
inner = TRUE,
line = 1,
cex = .9,
font = 3
)
mf_theme(my_theme)
mf_map(x = map, type = "base")
mf_map(x = map,
add = T, # After you created a map with mf_map, you need this!
var = c("inequality", "well_being"), # Two variables! BUT BE CAREFUL!
type = "prop_choro",
inches = 0.1, # Scale of circles
lwd = 1, # Line width
breaks = "equal",
nbreaks = 6, # Number of colors breaks
leg_title_cex = c(1,1), # Size of legend title
leg_val_cex = c(0.5,0.8), # Size of legend text
pal = "Spectral", # Palette
leg_pos = c("bottomleft", "left"), # Legend position
leg_title= c("Inequality","Well being"), # Legend titles
leg_val_rnd = c(2, 2), # Number of decimals in legends
leg_frame = c(TRUE, TRUE)) # Frames around legend?
mf_layout(title = "Inequality & well-being in the world",
credits = paste0("Sources: MISC"),
frame = TRUE)

With the leaflet package
Next, we move towards interactive maps with the leaflet package. The advantage of leaflet lies in its multiple options that create dynamic plots. The example below is strongly inspired from the reference page: https://rstudio.github.io/leaflet/choropleths.html Thus, looking at the difference between the two codes helps understand how they work.
We are going two use two features:
- a customized color scale;
- labels that appear when the cursor is on the map.
For the scale, we define it manually, both for the colors and for the separation points (bins).
The labels are defined using html functions.
One word on missing data: in the World data, “Czech Rep.” stands for “Czech Republic”, so we recode manually it to fill in a hole. Since it’s a factor in World, we need to use the recode_factor function. For simple text, recode would be sufficient.
if(!require(leaflet)){install.packages("leaflet")} # Install package
library(leaflet) # Load package
datamap <- World %>%
mutate(country = country %>% recode_factor(`Czech Rep.` = "Czech Republic")) %>%
left_join(gap2, by = "country") %>%
filter(year == 2011) # Keep only one year (recent)
palet <- colorBin("YlGnBu", # Yellow-Green-Blue palette
domain = datamap %>% pull(lifeExp), # LifeExp # Domain of labels: lifeExp
bins = c(50, 55, 60, 65, 70, 80, 90)) # Age categories
labels <- sprintf( # Below we define the labels
"<strong>%s</strong><br/>%g Years", # Adding text to label
datamap$country, # We show the country name...
datamap$lifeExp # ... and the life expectancy
) %>% lapply(htmltools::HTML) # Embedded all into html language
We can then move forward with the plot.
Better, but some African countries are still missing… Data problem!
It is possible to integrate leaflet into shiny: https://rstudio.github.io/leaflet/shiny.html You need renderLeaflet() in the server and leafletOutput() in the UI.
Points (with ggplot!)
Points are more easy to handle because they require less information (two coordinates at least plus other attributes like size or color). This is exactly the kind of data type that fits in the ggplot syntax! Below, we show how to combine ggplot with geographical data.
Country level data
While areas must be defined at the country level (complex!), points are simply characterized by two numbers: latitude & longitude. So a list of these two features suffices to plot points. For countries, it is possible to find data online, ex: https://developers.google.com/public-data/docs/canonical/countries_csv
The latitudes and longitudes correspond to the center of the countries.
Below, we create a new dataset with longitudes & latitutes combined with gapminder (updated version).
if(!require(readxl)){install.packages("readxl")} # Install package to read excel files
library(readxl) # Activate package
options(scipen=999) # Remove scientific notation
country_LL <- read_excel("country_LL.xlsx") # Read longitude/latitude data
data2 <- left_join(gap2, country_LL) %>% # Merge two datasets
na.omit() %>% # Remove missing points
mutate(latitude = as.numeric(latitude), # 'Numerize' latitude
longitude = as.numeric(longitude), # 'Numerize' longitude
population = round(population / 10^6,1)) # Scaling population into millions
data2 %>% head()
Be careful, the code below is computationally demanding (be patient). We use the viridis color palette. In ggplot, a small list of maps is available: world, France, Italy, New Zealand, US (and states).
if(!require(scales)){install.packages("scales")}
if(!require(viridis)){install.packages("viridis")} # Package for color gradient
library(viridis)
library(scales)
world <- map_data("world")
g <- ggplot(data = world) + # Data source
geom_polygon(data = world, aes(x = long, # Dark background of the world
y = lat,
group = group)) +
geom_point(data = data2 %>% filter(year == 2018), # Most recent data points
aes(x = longitude, # Classical ggplot syntax!
y = latitude,
size = population,
color = lifeExp,
label = country)) + # Label (for plotly)
scale_color_viridis(option = "magma") # Color scale
ggplotly(g) # Plotly integration
g <- ggplot(data = world) + # Data source
geom_polygon(data = world, aes(x = long, # Dark background of the world
y = lat,
group = group)) +
geom_point(data = data2 %>% filter(year == 2018), # Most recent data points
aes(x = longitude, # Classical ggplot syntax!
y = latitude,
size = population,
color = lifeExp,
label = country)) + # Label (for plotly)
scale_color_viridis(option = "magma") # Color scale
City level data
You can find longitudes & latitudes by searching on Google. Here’s one example for France: https://simplemaps.com/data/fr-cities Worldwide data can be accessed here: https://simplemaps.com/data/world-cities
cities <- read_excel("worldcities.xlsx")
That’s too big. Let’s focus on Germany, Italy, Austria, Slovenia and Switzerland (arbitrarily).
cities_short <- cities %>%
filter(country %in% c("Germany", "Italy", "Switzerland", "Austria", "Slovenia"),
population > 250000) %>%
na.omit()
head(cities_short)
world <- map_data("world")
g <- ggplot(data = world) + # Data source
geom_polygon(data = world, # Dark background of the world
aes(x = long, y = lat, group = group),
size = 0.5, color = "grey") + # Outer lines of polygons
geom_point(data = cities_short, # The points
aes(x = lng,
y = lat,
size = population,
color = country,
label = city)) +
#scale_colour_gradient(low = "#FFFB00", high = "#FF1B00") + # This line is for color = population
scale_colour_manual(values = c("#FFEC00", "#66E234", "#1BCBF7", "#CC44F8", "#FC2361")) +
coord_sf(xlim = c(5, 18), ylim = c(36, 57), expand = FALSE) # Zoom on Europe
ggplotly(g)
Other maps
For points, it’s easy: just use the same coordinate system via longitude & latitude. It’s much more tricky for areas.
The key is to find shape files (one alternative format is GeoJSON, but it is not treated here).
Example: http://datapages.com/gis-map-publishing-program/gis-open-files/global-framework/ Be very careful: some maps are EXTREMELY precise (up to 1m precision). Thus, the files are heavy and the plotting takes a LOT of time. Below, we use 100m precision, which is largely enough.
The full description of all shape file items can be accessed on wikipedia:
- .shp — shape format; the feature geometry itself
- .shx — shape index format; a positional index of the feature geometry to allow seeking forwards and backwards quickly
- .dbf — attribute format; columnar attributes for each shape, in dBase IV format
regions <- sf::st_read("regions-20140306-100m.shp")
Reading layer `regions-20140306-100m' from data source `/Users/coqueret/Documents/IT/COURS/2021/R4DS/S7_extensions/Geocomputing/regions-20140306-100m.shp' using driver `ESRI Shapefile'
Simple feature collection with 27 features and 10 fields
geometry type: MULTIPOLYGON
dimension: XY
bbox: xmin: -61.80976 ymin: -21.38973 xmax: 55.83665 ymax: 51.08984
geographic CRS: WGS 84
str(regions) # Structure of the variable region
Classes ‘sf’ and 'data.frame': 27 obs. of 11 variables:
$ code_insee: chr "42" "72" "83" "25" ...
$ nom : chr "Alsace" "Aquitaine" "Auvergne" "Basse-Normandie" ...
$ nom_cl : chr "Strasbourg" "Bordeaux" "Clermont-Ferrand" "Caen" ...
$ insee_cl : chr "67482" "33063" "63113" "14118" ...
$ nuts2 : chr "FR42" "FR61" "FR72" "FR25" ...
$ iso3166_2 : chr NA NA NA NA ...
$ wikipedia : chr "fr:Alsace" "fr:Aquitaine" "fr:Auvergne" "fr:Basse-Normandie" ...
$ nb_dep : num 2 5 4 3 4 4 6 4 2 4 ...
$ nb_comm : num 904 2296 1310 1812 2046 ...
$ surf_km2 : num 8328 41818 26172 17786 31752 ...
$ geometry :sfc_MULTIPOLYGON of length 27; first list element: List of 1
..$ :List of 1
.. ..$ : num [1:760, 1:2] 7.43 7.43 7.42 7.42 7.4 ...
..- attr(*, "class")= chr [1:3] "XY" "MULTIPOLYGON" "sfg"
- attr(*, "sf_column")= chr "geometry"
- attr(*, "agr")= Factor w/ 3 levels "constant","aggregate",..: NA NA NA NA NA NA NA NA NA NA
..- attr(*, "names")= chr [1:10] "code_insee" "nom" "nom_cl" "insee_cl" ...
That’s complicated. The only important column for now is the name of the region which is easy to recognize. Below, we plot a choropleth where the color is link to the area of the region (bigger = darker).
regions %>%
tm_shape() +
tm_polygons("surf_km2")

That’s prety ugly. Let’s remove the regions outside the mainland.
regions %>%
filter(! (nom %in% c("Guyane", "Mayotte", "Martinique", "Guadeloupe", "La Réunion"))) %>%
tm_shape() +
tm_polygons("surf_km2", border.col = "white") # With the border of polygons

Below, we change the color palette and plot the number of French “départements” inside each region.
regions %>%
filter(! (nom %in% c("Guyane", "Mayotte", "Martinique", "Guadeloupe", "La Réunion"))) %>%
tm_shape() +
tm_fill("nb_dep", palette = "Spectral") # Without the border

This is static: leaflet integration would be better.
Highcharts
The material below is inspired from the vignette https://cran.r-project.org/web/packages/highcharter/vignettes/charting-maps.html
The highcharter package creates good looking plots. Let’s install it (and activate it).
if(!require(highcharter)){install.packages("highcharter")}
library(highcharter)
The package has a large library of maps. The list is given here: https://code.highcharts.com/mapdata/
If you click on a link, you get the address of the map:

The last part is what you specify to get access to a map:
mapdata <- get_data_from_map(download_map_data("countries/fr/custom/fr-all-mainland"))
trying URL 'https://code.highcharts.com/mapdata/countries/fr/custom/fr-all-mainland.js'
Content type 'text/javascript' length 23676 bytes (23 KB)
==================================================
downloaded 23 KB
head(mapdata)
The information that will be plotted comes from this file. We need to add the value that we want to use for the plot. There are 21 regions in the variable map_data. This bypasses the merging of external data via the left_join() function.
mapdata <- mapdata %>% mutate(value = 1:nrow(mapdata)) # Completely random values
head(mapdata, 20)
Below, we plot the regions of France and the color code is driven by the column “value” defined above.
hcmap("countries/fr/custom/fr-all-mainland", # Source for the map
data = mapdata, # Source for the color code
value = "value", # Column for color code
joinBy = c("name"), # Common name & label
name = "Random Plot", # Name of plot
dataLabels = list(enabled = TRUE, format = "{point.name}"), # Allow labels
borderColor = "#FAFAFA", borderWidth = 0.5, # Border info
tooltip = list(valueDecimals = 1, # Label info
valuePrefix = "Value = ", # Before label
valueSuffix = " Units")) # After label
trying URL 'https://code.highcharts.com/mapdata/countries/fr/custom/fr-all-mainland.js'
Content type 'text/javascript' length 23676 bytes (23 KB)
==================================================
downloaded 23 KB
Fine grain local maps
Now let’s turn to smaller scale maps. The data comes from Kaggle, scrapped from tripadvisor. The useful columns: Longitude & Latitude, the MuseumName, the Rating and the LengthOfVisit.
With leaflet, it’s easy to obtain empty maps as long as you know the longitude & latitude of a particular location. Below: Lyon.
tripadvisor <- read_excel("tripadvisor_museum_US.xlsx") %>% # Importing the data
mutate(Longitude = as.numeric(Longitude), # Getting numbers back
Latitude = as.numeric(Latitude),
Rating = as.numeric(Rating),
LengthOfVisit = as.factor(LengthOfVisit)) %>%
filter(Latitude > 40, Latitude < 41, Longitude > (-74.5), Longitude < (-73.5)) # NY museums only
NAs introduced by coercionNAs introduced by coercion
tripadvisor
leaflet() %>% # An empty map: a good start!
setView(lng = 4.85, lat = 45.75, zoom = 12) %>%
addTiles()
Below, you need to specify lon and lat in the source dataframe. Or you can use the hcaes() function
mapdata2 <- get_data_from_map(download_map_data("countries/us/custom/us-ny-congress-113"))
trying URL 'https://code.highcharts.com/mapdata/countries/us/custom/us-ny-congress-113.js'
Content type 'text/javascript' length 23596 bytes (23 KB)
==================================================
downloaded 23 KB
hcmap("countries/us/custom/us-ny-congress-113") %>%
hc_mapNavigation(enabled = TRUE,
buttonOptions = list(
align = "right",
verticalAlign = "bottom")) %>%
hc_add_series(data = tripadvisor %>% mutate(lon = Longitude, lat = Latitude, name = MuseumName), # Add correct names
type = "mappoint", name = "Museums")
trying URL 'https://code.highcharts.com/mapdata/countries/us/custom/us-ny-congress-113.js'
Content type 'text/javascript' length 23596 bytes (23 KB)
==================================================
downloaded 23 KB
Not as impressive as leaflet, but the map could probably be improved. NOTE: in the example above, the geographical background and the points are unrelated.
A test with GeoJSON
GeoJSON are, with shapefiles the other main standard in geocomputing.
It’s easy to obtain files, e.g., via https://geojson-maps.ash.ms Also, you will need the rgdal package.
africa <- rgdal::readOGR("africa.geo.json")
OGR data source with driver: GeoJSON
Source: "/Users/coqueret/Documents/IT/COURS/2021/R4DS/S7_extensions/Geocomputing/africa.geo.json", layer: "africa.geo"
with 51 features
It has 64 fields
---
title: "Geocomputing with R"
output:
  html_notebook:
    toc: true
    toc_float: true
---


# Setup

First, install some new packages.


```{r, message = FALSE, warning = FALSE}
if(!require(tmap)){install.packages("tmap")}
if(!require(sf)){install.packages("sf")}
if(!require(maps)){install.packages("maps")}
if(!require(mapsf)){install.packages("mapsf")}
```


Then, activate the ones we need.

```{r, message = FALSE, warning = FALSE}
library(tidyverse)
library(readxl)
library(tmap)        # Geographical plotting
library(gapminder)   # Country level data
library(maps)        # Geographical datasets
library(sf)          # Shape files (sf) are a common geographical format 
library(plotly)      # Dynamic graphs
library(mapsf)       # A cool new package!
```


Now, load the data (included in the packages). The World dataset contains geographical data. 

```{r}
data("World")        # From tmap!
str(World)           # structure of World
data("gapminder")
head(gapminder)
```

The world data is very rich: it has many attributes (population, economy, etc.). The one column we are interested in is **geometry**: it contains the data for the maps!

**NOTE**: there are 2 very useful packages that allow to retrieve data from the World Bank: **WID** and **wbstats**: have a look!

We see that a few points are missing for life expectancy, so let's see if we can complete with the gapminder data (spoiler: not so much!).

**This exercise (see below) of grouping two datasets is paramount when plotting geographical content because usually, the geometric properties come from one file and the items we wish to plot come from another file. Thus, the joining step is KEY.**

# Areas

Maps in which areas are coloured to show some numerical attribute are called *choropleths*.

## With the tmap package

The idea below is to merge the two sources to combine geographical forms and life expectancy. To do that, we need **left_join**(). But before, we need to create a common variable (i.e., a *key*) between the two. In World, the term "name" is not so great so let's change it into "country" (just like in gapminder!). Then we can merge!

```{r, message = FALSE, warning = FALSE}
names(World)[2] <- "country"                    # Change the column name
gapminder %>% 
  left_join(World, by = "country") %>%          # Join the two datasets
  select(country, lifeExp, pop, geometry)
nrow(gapminder)                                 # Number of instances
```

There are missing points. As a rough approximation, let's just remove them and store the result in **data** (more on them later). 

```{r, message = FALSE, warning = FALSE}
data <- gapminder %>%                     # Join and save via the arrow operator <- 
  left_join(World, by = "country") %>%
  na.omit()                               # Remove rows with missing data
nrow(data)                                # Number of instances in dataset
```

The loss is large because of missing points! We then use the *tmap* package to create a choropleth.


```{r}
data %>% 
  filter(year == 2007) %>%    # Keep only one year (problems otherwise)
  sf::st_sf() %>%             # Transform data into shape file (sf) format
  tm_shape() +                # Tranforms the sf into a tmap object
  tm_polygons("lifeExp",      # Plot!
              alpha = 0.9,
              palette = "Spectral")      
```


Some countries are missing => data problem because gapminder is incomplete.
Let's complete it via: https://www.gapminder.org/data/
This yields a much bigger dataset, called "gap2.RData". Let's see:

```{r}
load("gap2.RData")
head(gap2)
```

It's highly tidy! Let's un-tidy it a bit to put it in the original *gapminder* format. 

```{r}
gap2 <- gap2 %>% pivot_wider(names_from = "indicator", values_from = "value")
```

Again, we resort to left_join to merge the two data sources.

```{r, message = FALSE, warning = FALSE}
data <- World %>% 
  left_join(gap2, by = "country") #%>%
#na.omit() 
```


## With the **mapsf** package

**References**:  
https://riatelab.github.io/mapsf/articles/mapsf.html  
https://github.com/riatelab/mapsf  

It's super easy: you just need a shape file with the attribute you want to plot!

```{r, message = F, warning = F}
data %>% 
  filter(year == 2018) %>%    # Keep only one year (problems otherwise)
  sf::st_sf() %>%
  mf_map(x = ., var = "lifeExp", type = "choro")
```

The package does not really use pipes.

```{r, message = F, warning = F}
map <- data %>% 
  filter(year == 2018) %>%    # Keep only one year (problems otherwise)
  sf::st_sf()

my_theme <- list(
  name = "mytheme", 
  bg = "#DDEEFF",        # Light blue
  fg = "#444444",        # Dark grey
  mar = c(0, 0, 0, 0), 
  tab = TRUE, 
  pos = "center", 
  inner = TRUE, 
  line = 1, 
  cex = .9, 
  font = 3
)

mf_theme(my_theme)
mf_map(x = map, type = "base")
mf_map(x = map,
       add = T,             # After you created a map with mf_map, you need this!
       var = c("inequality", "well_being"),      # Two variables! BUT BE CAREFUL!
       type = "prop_choro",
       inches = 0.1,                             # Scale of circles
       lwd = 1,                                  # Line width
       breaks = "equal", 
       nbreaks = 6,                              # Number of colors breaks
       leg_title_cex = c(1,1),                   # Size of legend title
       leg_val_cex = c(0.5,0.8),                 # Size of legend text
       pal = "Spectral",                         # Palette
       leg_pos = c("bottomleft", "left"),        # Legend position
       leg_title= c("Inequality","Well being"),  # Legend titles
       leg_val_rnd = c(2, 2),                    # Number of decimals in legends
       leg_frame = c(TRUE, TRUE))                # Frames around legend?
mf_layout(title = "Inequality & well-being in the world", 
          credits = paste0("Sources: MISC"), 
          frame = TRUE)
```


## With the **leaflet** package

Next, we move towards interactive maps with the *leaflet* package. The advantage of *leaflet* lies in its multiple options that create dynamic plots. The example below is strongly inspired from the reference page: https://rstudio.github.io/leaflet/choropleths.html
Thus, looking at the difference between the two codes helps understand how they work. 

We are going two use two features:   
- a customized color scale;    
- labels that appear when the cursor is on the map.

For the scale, we define it manually, both for the colors and for the separation points (bins).   
The labels are defined using html functions.

One word on missing data: in the *World* data, "Czech Rep." stands for "Czech Republic", so we recode manually it to fill in a hole. Since it's a factor in *World*, we need to use the **recode_factor** function. For simple text, **recode** would be sufficient.

```{r, message = FALSE, warning = FALSE}
if(!require(leaflet)){install.packages("leaflet")}                  # Install package
library(leaflet)                                                    # Load package
datamap <- World %>% 
  mutate(country = country %>% recode_factor(`Czech Rep.` = "Czech Republic")) %>%
  left_join(gap2, by = "country") %>%
  filter(year == 2011)                         # Keep only one year (recent)

palet <- colorBin("YlGnBu",                                         # Yellow-Green-Blue palette
                  domain = datamap %>% pull(lifeExp), # LifeExp     # Domain of labels: lifeExp
                  bins = c(50, 55, 60, 65, 70, 80, 90))             # Age categories

labels <- sprintf(                                                  # Below we define the labels
  "<strong>%s</strong><br/>%g Years",                               # Adding text to label
  datamap$country,                                                  # We show the country name...
  datamap$lifeExp                                                   # ... and the life expectancy
) %>% lapply(htmltools::HTML)                                       # Embedded all into html language
```

We can then move forward with the plot.

```{r, message = FALSE, warning = FALSE}
datamap %>% 
  data.frame() %>%                                # Turn into dataframe (technical)
  sf::st_sf() %>%                                 # Format in sf
  st_transform("+init=epsg:4326") %>%             # Convert in particular coordinate reference 
  leaflet() %>%                                   # Call leaflet
  addPolygons(fillColor = ~palet(lifeExp),        # Create the map (colored polygons)
              weight = 2,                         # Width of separation line
              opacity = 1,                        # Opacity of separation line
              color = "white",                    # Color of separation line
              dashArray = "3",                    # Dash size of separation line
              fillOpacity = 0.7,                  # Opacity of polygon colors
              highlight = highlightOptions(       # 5 lines below control the cursor impact
                weight = 2,                       # Width of line
                color = "#EEEEEE",                # Color of line
                dashArray = "",                   # No dash
                fillOpacity = 0.9,                # Opacity
                bringToFront = TRUE),
              label = labels,                     # LABEL! Defined above!
              labelOptions = labelOptions(        # Label options below...
                style = list("font-weight" = "normal", padding = "3px 8px"),
                textsize = "15px",
                direction = "auto")
  ) %>%
  addLegend(pal = palet,                    # Legend: comes from palet colors defined above
            values = ~lifeExp,              # Values come from lifeExp variable
            opacity = 0.9,                  # Opacity of legend
            title = "Map Legend",           # Title of legend
            position = "bottomright")       # Position of legend
```


Better, but some African countries are still missing... Data problem!

It is possible to integrate *leaflet* into *shiny*: https://rstudio.github.io/leaflet/shiny.html
You need **renderLeaflet**() in the server and **leafletOutput**() in the UI.


# Points (with ggplot!)

Points are more easy to handle because they require less information (two coordinates at least plus other attributes like size or color). This is exactly the kind of data type that fits in the **ggplot** syntax! Below, we show how to combine ggplot with geographical data.

## Country level data

While areas must be defined at the country level (complex!), points are simply characterized by two numbers: **latitude** & **longitude**. So a list of these two features suffices to plot points. For countries, it is possible to find data online, ex: https://developers.google.com/public-data/docs/canonical/countries_csv

The latitudes and longitudes correspond to the center of the countries.

Below, we create a new dataset with longitudes & latitutes combined with *gapminder* (updated version).
```{r, message = FALSE, warning = FALSE}
if(!require(readxl)){install.packages("readxl")}    # Install package to read excel files
library(readxl)                                     # Activate package
options(scipen=999)                                 # Remove scientific notation
country_LL <- read_excel("country_LL.xlsx")         # Read longitude/latitude data
data2 <- left_join(gap2, country_LL) %>%            # Merge two datasets
  na.omit() %>%                                     # Remove missing points
  mutate(latitude = as.numeric(latitude),           # 'Numerize' latitude
         longitude = as.numeric(longitude),         # 'Numerize' longitude
         population = round(population / 10^6,1))   # Scaling population into millions
data2 %>% head()
```

Be careful, the code below is computationally demanding (be patient). We use the *viridis* color palette. 
In *ggplot*, a small list of maps is available: world, France, Italy, New Zealand, US (and states). 

```{r, message = FALSE, warning = FALSE}
if(!require(scales)){install.packages("scales")}
if(!require(viridis)){install.packages("viridis")} # Package for color gradient
library(viridis)                                    
library(scales) 
world <- map_data("world")
g <- ggplot(data = world) +                               # Data source
  geom_polygon(data = world, aes(x = long,                # Dark background of the world
                                 y = lat, 
                                 group = group)) +   
  geom_point(data = data2 %>% filter(year == 2018),       # Most recent data points
             aes(x = longitude,                           # Classical ggplot syntax!
                 y = latitude, 
                 size = population, 
                 color = lifeExp, 
                 label = country)) +                      # Label (for plotly)
  scale_color_viridis(option = "magma")                   # Color scale
ggplotly(g)                                               # Plotly integration
```

## City level data

You can find longitudes & latitudes by searching on Google. Here's one example for France:
https://simplemaps.com/data/fr-cities
Worldwide data can be accessed here: https://simplemaps.com/data/world-cities

```{r, message = FALSE, warning = FALSE}
cities <- read_excel("worldcities.xlsx")
```

That's too big. Let's focus on Germany, Italy, Austria, Slovenia and Switzerland (arbitrarily).

```{r}
cities_short <- cities %>% 
  filter(country %in% c("Germany", "Italy", "Switzerland", "Austria", "Slovenia"),
         population > 250000) %>%
  na.omit()
head(cities_short)
```

```{r, message = FALSE, warning = FALSE}
world <- map_data("world")
g <- ggplot(data = world) +                                     # Data source
  geom_polygon(data = world,                                    # Dark background of the world
               aes(x = long, y = lat, group = group),
               size = 0.5, color = "grey") +                    # Outer lines of polygons
  geom_point(data = cities_short,                               # The points
             aes(x = lng, 
                 y = lat, 
                 size = population, 
                 color = country, 
                 label = city)) +
  #scale_colour_gradient(low = "#FFFB00", high = "#FF1B00") +  # This line is for color = population
  scale_colour_manual(values = c("#FFEC00", "#66E234", "#1BCBF7", "#CC44F8", "#FC2361")) +
  coord_sf(xlim = c(5, 18), ylim = c(36, 57), expand = FALSE)             # Zoom on Europe
ggplotly(g)
```


# Other maps

For points, it's easy: just use the same coordinate system via longitude & latitude. It's much more tricky for areas.

The key is to find **shape files** (one alternative format is GeoJSON, but it is not treated here).  
Example: http://datapages.com/gis-map-publishing-program/gis-open-files/global-framework/ 
Be very careful: some maps are *EXTREMELY* precise (up to 1m precision). Thus, the files are heavy and the plotting takes a *LOT* of time. Below, we use 100m precision, which is largely enough.

The full description of all shape file items can be accessed on wikipedia:

* .shp — shape format; the feature geometry itself   
* .shx — shape index format; a positional index of the feature geometry to allow seeking forwards and backwards quickly   
* .dbf — attribute format; columnar attributes for each shape, in dBase IV format

```{r, message = FALSE, warning = FALSE}
regions <- sf::st_read("regions-20140306-100m.shp")
str(regions)    # Structure of the variable region
```

That's complicated. The only important column for now is the *name* of the region which is easy to recognize. Below, we plot a choropleth where the color is link to the area of the region (bigger = darker).

```{r}
regions %>% 
  tm_shape() +
  tm_polygons("surf_km2")
```

That's prety ugly. Let's remove the regions outside the mainland. 

```{r}
regions %>%
  filter(! (nom %in% c("Guyane", "Mayotte", "Martinique", "Guadeloupe", "La Réunion"))) %>% 
  tm_shape() +
  tm_polygons("surf_km2", border.col = "white") # With the border of polygons
```

Below, we change the color palette and plot the number of French "départements" inside each region.

```{r}
regions %>%
  filter(! (nom %in% c("Guyane", "Mayotte", "Martinique", "Guadeloupe", "La Réunion"))) %>% 
  tm_shape() +
  tm_fill("nb_dep", palette = "Spectral")  # Without the border
```

This is static: *leaflet* integration would be better.

# Highcharts

The material below is inspired from the vignette https://cran.r-project.org/web/packages/highcharter/vignettes/charting-maps.html

The *highcharter* package creates good looking plots. Let's install it (and activate it).

```{r, message = FALSE, warning = TRUE}
if(!require(highcharter)){install.packages("highcharter")}
library(highcharter)
```

The package has a large library of maps. The list is given here:
https://code.highcharts.com/mapdata/

If you click on a link, you get the address of the map:

![](high_list.png)

The last part is what you specify to get access to a map:

```{r, message = FALSE, warning = FALSE}
mapdata <- get_data_from_map(download_map_data("countries/fr/custom/fr-all-mainland"))
head(mapdata)
```

The information that will be plotted comes from this file. We need to add the value that we want to use for the plot. There are 21 regions in the variable *map_data*. This bypasses the merging of external data via the **left_join**() function.

```{r, message = FALSE, warning = FALSE}
mapdata <- mapdata %>% mutate(value = 1:nrow(mapdata)) # Completely random values
head(mapdata, 20)
```

Below, we plot the regions of France and the color code is driven by the column "value" defined above. 

```{r, message = FALSE, warning = FALSE}
hcmap("countries/fr/custom/fr-all-mainland",                      # Source for the map
      data = mapdata,                                             # Source for the color code
      value = "value",                                            # Column for color code
      joinBy = c("name"),                                         # Common name & label
      name = "Random Plot",                                       # Name of plot
      dataLabels = list(enabled = TRUE, format = "{point.name}"), # Allow labels
      borderColor = "#FAFAFA", borderWidth = 0.5,                 # Border info
      tooltip = list(valueDecimals = 1,                           # Label info
                     valuePrefix = "Value = ",                    # Before label
                     valueSuffix = " Units"))                     # After label
```


# Fine grain local maps

Now let's turn to smaller scale maps. The data comes from Kaggle, scrapped from tripadvisor. The useful columns: *Longitude* & *Latitude*, the *MuseumName*, the *Rating* and the *LengthOfVisit*. 

With leaflet, it's easy to obtain empty maps as long as you know the longitude & latitude of a particular location. Below: **Lyon**.

```{r}
tripadvisor <- read_excel("tripadvisor_museum_US.xlsx") %>%    # Importing the data
  mutate(Longitude = as.numeric(Longitude),                    # Getting numbers back
         Latitude = as.numeric(Latitude),
         Rating = as.numeric(Rating),
         LengthOfVisit = as.factor(LengthOfVisit)) %>%
  filter(Latitude > 40, Latitude < 41, Longitude > (-74.5), Longitude < (-73.5)) # NY museums only
tripadvisor

leaflet() %>%                                                  # An empty map: a good start!
  setView(lng = 4.85, lat = 45.75, zoom = 12) %>% 
  addTiles()
```

```{r}
pal <- colorFactor(palette = "Spectral", domain = tripadvisor$LengthOfVisit)
tripadvisor %>%
  leaflet() %>%                                                   # A blank sheet...
  setView(lng = -73.9772, lat = 40.7808, zoom = 10) %>% 
  addTiles() %>%
  addCircles(lng = ~Longitude, lat = ~Latitude, 
             radius = ~Rating^4, label =  ~MuseumName,
             fillColor = ~pal(LengthOfVisit), fillOpacity = 0.9,  # Circle colors
             stroke = TRUE, color = "black", weight = 2) %>%      # Circle stroke
  addLegend(pal = pal,                      # Legend: comes from palet colors defined above
            values = ~LengthOfVisit,        # Values come from lifeExp variable
            opacity = 0.7,                  # Opacity of legend
            title = "Map Legend",           # Title of legend
            position = "bottomright")       # Position of legend
```

Below, you need to specify **lon** and **lat** in the source dataframe. Or you can use the *hcaes*() function

```{r}
mapdata2 <- get_data_from_map(download_map_data("countries/us/custom/us-ny-congress-113"))
hcmap("countries/us/custom/us-ny-congress-113") %>%
  hc_mapNavigation(enabled = TRUE,
                   buttonOptions = list(
                     align = "right",
                     verticalAlign = "bottom")) %>%
  hc_add_series(data = tripadvisor %>% mutate(lon = Longitude, lat = Latitude, name = MuseumName),  # Add correct names
                type = "mappoint", name = "Museums") 

```

Not as impressive as leaflet, but the map could probably be improved. 
**NOTE**: in the example above, the geographical background and the points are unrelated.

# Resources

Below, a link of great pages & tutorials:

**Geocomputation with R** (online book): https://geocompr.robinlovelace.net    
**ggplot**!: https://ggplot2.tidyverse.org/reference/map_data.htmlPure     
**sf** (shape files): https://cran.r-project.org/web/packages/sf/vignettes/sf1.html   
**sf & R**: https://r-spatial.github.io/sf/articles/sf1.html   
**sf & ggplot**: https://www.r-spatial.org/r/2018/10/25/ggplot2-sf-2.html   
**list of sf**: http://datapages.com/gis-map-publishing-program/gis-open-files/global-framework/global-heat-flow-database/shapefiles-list  
**tmap**: https://cran.r-project.org/web/packages/tmap/vignettes/tmap-getstarted.html   
**ggmap**: https://github.com/dkahle/ggmap    
**leaflet**: https://rstudio.github.io/leaflet/    
**Oregon** (tutorial): http://geog.uoregon.edu/bartlein/courses/geog495/lec06.html    




# A test with GeoJSON

GeoJSON are, with shapefiles the other main standard in geocomputing.   
It's easy to obtain files, e.g., via https://geojson-maps.ash.ms
Also, you will need the **rgdal** package.


```{r, message = F, warning = F}
library(rgdal)
africa <- rgdal::readOGR("africa.geo.json")
```
```{r}
pal <- colorNumeric("viridis", NULL)

leaflet(africa) %>%
  addTiles() %>%
  addPolygons(fillOpacity = 0.7,
              fillColor = ~pal(pop_est),
              weight = 2,
              opacity = 1,
              color = "white") %>%
  addLegend(pal = pal, values = ~pop_est, opacity = 0.7, title = NULL,
    position = "bottomleft")
```

